(57) Abstract

The hPMS2 gene encodes a protein which is involved in DNA mismatch repair and is mutated in a subset of patients with hereditary
nonpolyposis colon cancer (HNPCC). The previously published hPMS2 cDNA sequence lacks an upstream in-frame stop codon preceding
the presumptive initiating methionine. To further evaluate the 5' terminus of the hPMS2 coding region, we isolated additional cDNA
clones, RT-PCR products, and the corresponding 5' genomic segment of the hPMS2 locus. The hPMS2 gene transcripts were found to
have heterogeneous but collinear 5' termini, one of which contained an in-frame termination codon preceding the initiating methionine. In
addition, a gene encoding a 34.5 kDa polypeptide was found to transcriptionally initiate within hPMS2 from the opposite strand.

peptides from the 85 kDa protein revealed it to be the product of hMLH1, and this
protein's molecular weight agreed with that predicted from the cDNA sequence
(Bronner et al., 1994; Papadopoulos et al., 1994). The sequence of the peptide
generated from the 110 kDa component showed it to be similar to the hPMS2
mutL-homolog; however, the predicted molecular weight of hPMS2 is only 95 kDa
(Nicolaides et al., 1994). Since the previously isolated hPMS2 cDNA clones
lacked an in-frame termination codon upstream of the presumptive initiating
methionine, it was possible that the open reading frame extended further upstream.
Thus there is a need in the art for further knowledge of the genetic structures of
and adjacent to the known hPMS2 gene.

SUMMARY OF THE INVENTION 

It is an object of the invention to provide a novel, isolated, human gene on
chromosome 7.

It is an object of the invention to provide vectors and host cells for making
a novel human gene product.

It is another object of the invention to provide compositions of matter 
containing the human gene product. 

These and other objects are provided by one or more of the embodiments 
described below. In one embodiment of the invention, a segment of cDNA is
provided. The cDNA consists of the sequence of nucleotides shown in Figure 2.

According to another embodiment of the invention, a vector comprising the
segment of cDNA which consists of the sequence of nucleotides shown in Figure
2 is provided, as well as host cells comprising the vector.

According to still another embodiment of the invention, a composition is
provided. The composition consists essentially of a protein consisting of the amino
acid sequence shown in Figure 2.

In yet another embodiment of the invention a composition of protein JTV1
as shown in Figure 1 is provided. The composition is free of other human
proteins.

In another embodiment of the invention a segment of cDNA is provided
which segment encodes the amino acid sequence of JTV1 protein shown in
Figure 2.

cDNA probes are also provided by the present invention. The cDNA
portion of said probes consists of between 15 and 1176 contiguous nucleotides of
the sequence shown in SEQ ID NO:1.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows the sequence of the 5' region of hPMS2 and predicted
coding region. The arrow indicates the 5' end of the previously published cDNA
clone. The presumptive initiating methionine is underlined.

Figure 2 shows the sequence of JTV1. The sequence has been deposited
in Genbank, accession number U24169. The presumptive initiating methionine is 

Figure 3 demonstrates the genomic localization of JTV1. The genomic
localization of hPMS2 and JTV1 was confirmed by screening somatic-cell hybrids
containing various regions of human chromosome 7. Lane 1, GM10791 contains
entire chromosome 7 in a Chinese hamster ovary (CHO) background; lane 2,
NA11440 contains 7pter>7p22 in a CHO background; lane 3, Ru-Rag4-13
contains 7cen-7pter in a murine background; lane 4, 4AF1/106/K015 contains
7cen-qter in a murine background; lane 5, GM05184.17 contains 7q21.2-qter in
a CHO background; lane 6, 2068Rag22-2 contains 7q22-qter in a murine
background; lane 7, human genomic DNA; lane 8, mouse genomic DNA; lane 9,
CHO genomic DNA.

Figure 4 demonstrates the mapping of transcriptional start sites of hPMS2
and JTV1. Sequence of the genomic region containing the 5' ends of the two
genes is shown. The sequence is numbered with respect to codon 1 of hPMS2.
Lower case letters denote intronic sequence of JTV1 (from nt -479 to -833) and
hPMS2 (from +24 to +108). Arrows indicate the 5' ends of hPMS2 (sense
strand) and of JTV1 (antisense strand) cDNA clones. The underlined ATG codons
indicate the predicted initiating methionines for hPMS2 (at nt +1 on the sense
strand) and JTV1 (at nt -345 on the antisense strand). The sequence has been
deposited in Genbank, accession number U24168.

Figure 5 shows the expression of hPMS2 and JTV1. RNA from various
tissues was incubated with reverse transcriptase (RT+) or in control reactions
without reverse transcriptase (RT-). The cDNA was used as template for PCR
with primers specific for hPMS2 (A) and JTV1 (B). RT-PCR products were
separated by polyacrylamide gel electrophoresis.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

To investigate the upstream region from hPMS2, we isolated additional
cDNA clones, analyzed the 5' end of hPMS2 transcripts with PCR-based
techniques, and cloned the corresponding genomic segments. In addition to
clarifying the transcript, we serendipitously discovered a previously undescribed
gene overlapping hPMS2. That gene is termed herein JTV1. The sequences of the
JTV1 cDNA and protein are shown in SEQ ID NOS:1 and 2, respectively.

A segment of cDNA according to the present invention refers to a
contiguous stretch of deoxyribonucleotides which have a sequence as obtained upon
reverse transcription of an RNA transcript. Such segments do not contain introns.
The segment may be an isolated molecule or it can be covalently joined to other
nucleic acid sequences. The segment may, for example, be replicated as part of
a vector, such as a plasmid, virus, or minichromosome. The vector may be
replicated within a host cell, such as a cell transformed by a recombinant DNA
molecule. The host cell may be used to produce JTV1 protein. It can also be
used to study regulation of expression of JTV1 sequences, for example by
subjecting the host cell to various agents which may or may not affect the
expression. Although the DNA sequence is discussed with particularity herein, it
is well within the skill of the art to make small mutations, such as single nucleic
acid substitutions of one of the other three nucleic acid bases, at any of the
positions of the sequence. In addition, it is well within the art to make single base
deletions or single base insertions, to study the effect upon protein structure and
function.
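
The space of small mutations described above can be illustrated with a minimal sketch. The function name `single_base_variants` and the six-base toy input are assumptions made for illustration only; they are not part of the disclosed sequences.

```python
BASES = "ACGT"

def single_base_variants(seq):
    """Return three sets: all single-base substitutions, deletions, and insertions of seq."""
    seq = seq.upper()
    subs, dels, ins = set(), set(), set()
    for i, base in enumerate(seq):
        # Replace position i with each of the other three bases.
        for b in BASES:
            if b != base:
                subs.add(seq[:i] + b + seq[i + 1:])
        # Remove the base at position i.
        dels.add(seq[:i] + seq[i + 1:])
    for i in range(len(seq) + 1):
        # Insert each base before position i (i == len(seq) appends at the end).
        for b in BASES:
            ins.add(seq[:i] + b + seq[i:])
    return subs, dels, ins

subs, dels, ins = single_base_variants("ATGCCG")  # hypothetical toy sequence
print(len(subs), len(dels), len(ins))
```

Sets are used because deleting either base of a repeated pair, or inserting a base next to an identical one, yields the same string, so the number of distinct deletions and insertions can be smaller than the number of positions tried.
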

If JTV1 is produced in a recombinant host cell which is not human, a
composition of JTV1 protein will be produced which is free of other human
proteins. If JTV1 protein is isolated from naturally producing cells, or from
human host cells, then the protein can be purified, for example, using antibodies
which are raised against an immunogen comprising JTV1 amino acid sequence.
Any other means of purification known in the art can be used, as is desired.

DNA molecules can be made having different nucleotide sequences from 
that disclosed in SEQ ID NO:1, but which still encode the JTV1 protein as
disclosed in SEQ ID NO:2. Using the known coding relationships between codons 
and amino acids and the disclosed amino acid sequence, numerous other sequences 
can be readily designed and produced. Such DNA molecules are within the 
contemplation of the subject invention. 
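
As a minimal sketch of how such alternative coding sequences could be designed, the following Python builds the standard genetic code and enumerates every DNA sequence encoding a short peptide. The helper names (`count_encodings`, `synonymous_sequences`, `translate`) are hypothetical, and the example uses only the first four residues of the listed JTV1 protein (Met Pro Met Tyr) to keep the output small.

```python
from itertools import product
from math import prod

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Reverse lookup: amino acid -> every codon that encodes it.
CODONS_FOR = {}
for codon, aa in CODON_TABLE.items():
    CODONS_FOR.setdefault(aa, []).append(codon)

def count_encodings(peptide):
    """Number of distinct DNA sequences encoding the peptide (one-letter code)."""
    return prod(len(CODONS_FOR[aa]) for aa in peptide)

def synonymous_sequences(peptide):
    """Every DNA sequence encoding the peptide; practical only for short peptides."""
    return ["".join(codons) for codons in product(*(CODONS_FOR[aa] for aa in peptide))]

def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

peptide = "MPMY"  # Met Pro Met Tyr
variants = synonymous_sequences(peptide)
print(count_encodings(peptide))    # 8
print("ATGCCGATGTAC" in variants)  # True
```

For a full-length protein the count grows multiplicatively with each residue, so in practice one would count the encodings rather than enumerate them.
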

cDNA probes can be used for hybridization studies. Typically they are
labeled with a detectable marker, such as a radiolabel or a fluorescent moiety,
although they need not be. The cDNA probes of the subject invention consist of
at least 15 contiguous nucleotides of the sequence shown in SEQ ID NO:1. If
greater specificity is desired, larger molecules of 18, 20, 25, or 30 nucleotides can
be used, up to a maximum of the entire sequence of 1176 nucleotides.
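
The probe definition above amounts to taking contiguous windows of a chosen length from the cDNA. The sketch below illustrates this under stated assumptions: the function names are hypothetical and the input is a 23-nt toy string, not the disclosed sequence. Only the bounds of 15 and 1176 nucleotides come from the text.

```python
MIN_PROBE_LEN = 15   # lower bound stated in the text
FULL_LENGTH = 1176   # full sequence length stated in the text

def probe_windows(seq, length):
    """All contiguous sub-sequences of the given length, in 5'-to-3' order."""
    if not MIN_PROBE_LEN <= length <= len(seq):
        raise ValueError("probe length must be between 15 and the sequence length")
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]

def count_windows(total_len, length):
    """Number of possible start positions for a probe of the given length."""
    return total_len - length + 1

toy = "ATGCCGATGTACCAGGTAAAGCC"  # 23-nt toy sequence (assumption)
print(len(probe_windows(toy, 15)))     # 9
print(count_windows(FULL_LENGTH, 15))  # 1162
print(count_windows(FULL_LENGTH, 30))  # 1147
```

Longer probes have fewer possible start positions, ending at a single window when the probe spans the entire sequence.
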

JTV1 cDNAs can be used as probes to detect deletions in chromosome 7.
Due to the overlapping promoter regions, large deletions of JTV1 would also be
expected to affect PMS2 expression, leading to Hereditary Non-Polyposis
Colorectal Cancer (HNPCC). JTV1 cDNA can be used in chromosome mapping.
It can also be used to assay activity or competence of the PMS2 promoter region.
The presence of JTV1 transcripts or JTV1 protein suggests that the PMS2 promoter
is intact. If the PMS2 promoter is intact and PMS2 products are absent, a
structural defect in the coding region is indicated.

JTV1 sequences can be used to guide homologous recombination at the
PMS2 locus. For example, where a PMS2 mutation is present and therapeutic
replacement with a wild-type gene is desired, PMS2 sequences can be used to
provide an adjacent region of homology. Similarly, it may be desirable to target
other genes to the region adjacent to PMS2. JTV1 sequences can be used to flank
such other genes, providing one or more regions of homology. If insertion of
other genes is desired between the JTV1 and the PMS2 sequences, again, this can
be accomplished using the identified sequences as homology units for homologous
recombination.

Examples 

Example 1

Isolation and sequence analysis of the 5' end of hPMS2. 

Purified DNA from P1 clone 53, previously determined to contain the
hPMS2 gene (Nicolaides et al., 1994), was digested with EcoRI and subcloned
into the pBluescript vector (Stratagene). Clones containing the 5' region of hPMS2
were identified by hybridization with primer A (Table 1) directed to exon 1.
Restriction analysis of several positive clones showed them to be identical. The
sequence of the relevant region of hPMS2 was determined from both strands using
35S α-dATP and Sequenase (USB).

Table 1. Primers used for hPMS2.

PRIMER NAME  STRAND     PRIMER SEQUENCE                 POSITION*
A            sense      5'-cgggtgttgcatccatgg-3'        -14 - +4
B            sense      5'-gggtggagcacaacgtcg-3'        -110 - -93
C            sense      5'-ggtcacgacggagaccg-3'         -283 - -267
D            sense      5'-tgcaggtgggaagctccacacgg-3'   -414 - -392
E            sense      5'-tagctcctgccgtgcacg-3'        -448 - -431
F            sense      5'-cgctcctacctgcacgtg-3'        -487 - -470
G            antisense  5'-tagactcagtaccacctgc-3'       +90 - +107
H                       5'-tacagaacctgctaaggcc-3'       +24 - +42
I            antisense                                  +116 - +136
J            sense      5'-caaccatgagacacatcgc-3'       +2545 -
K            antisense  5'-aggttagtgaagactOgtc-3'       +2647 - +2666

* Relative to the presumptive initiating methionine in Figure 1.



Three clones were isolated, each containing an 8.5 kb EcoRI insert. Partial 
sequence analysis of one clone, pSMN, determined that it contained coding 
residues of hPMS2 as well as sequences upstream of the previously designated 
codon 1. The presumptive initiating codon reported previously has been 
designated as nucleotide 1 in Figure 1. The sequence of hPMS2 was extended 833 
bp upstream of nucleotide 1. This sequence revealed an in-frame stop codon 321
nts upstream of the published initiator methionine, with no intervening methionines 
(Figure 1). 
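
The check described above, looking for an in-frame stop codon upstream of a candidate initiator with no intervening methionine, can be sketched as follows. This is an illustrative assumption about how such a scan could be coded, not the procedure used by the authors; the function name and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def upstream_inframe_stop(seq, atg_index):
    """Walk upstream from the ATG at atg_index, one codon at a time.

    Returns (distance_nt, saw_intervening_atg), where distance_nt is measured
    from the first base of the stop codon to the first base of the ATG, or
    (None, saw_intervening_atg) if no in-frame stop is found.
    """
    seq = seq.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index must point at an ATG codon")
    saw_atg = False
    for i in range(atg_index - 3, -1, -3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            return atg_index - i, saw_atg
        if codon == "ATG":
            saw_atg = True
    return None, saw_atg

toy = "CCTGAGCAGCAATGGCC"  # toy sequence: TGA, two GCA codons, then ATG
print(upstream_inframe_stop(toy, 11))  # (9, False)
```

A stop codon is in frame with the initiator only when the distance between them is a multiple of three, which holds for the 321 nt reported above (107 codons).
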

Example 2

Isolation of additional cDNA clones using hPMS2 probes. 

Two cDNA libraries were screened with a probe containing nt +24 to
+136 of hPMS2 generated by PCR using P1 clone 53 as template and the primers
H and I (Table 1). A human small intestine random-primed cDNA library in
λgt10 (Clontech) and a HeLa oligo-dT primed cDNA library in λZAPII
(Stratagene) were screened as described except hybridizations were carried out at
68°C and filters were washed at 65°C for 0.5 hrs (Kinzler and Vogelstein, 1989).
Following plaque purification, the EcoRI inserts from the small intestine library
were subcloned into pBluescript vector, while the HeLa cDNA inserts were
rescued as phagemids following the manufacturer's protocol (Stratagene).

One clone was isolated from the random-primed small intestine library, and
this contained nt -14 to nt +1668 of hPMS2. Two clones were isolated from the
oligo-dT primed HeLa cDNA library. The clones began at nt -53 and ended at
either nt +2722 or +2749. The HeLa cDNA library was also screened with a
430 bp probe from the 5' genomic region of hPMS2, containing nt -414 to +16,
generated by PCR from P1 clone 53 using primers D (Table 1) and O (Table 2).
The same two clones were identified, as expected. However, twelve other
overlapping clones were found and appeared to represent a different transcript,
named JTV1 (Figure 2). These twelve cDNAs were approximately 1.2 kb in
length and were sequenced in their entirety. All twelve ended with a polyA tract
(assumed to be the 3' end) and were identical for 1.2 kb upstream. The 5' ends
were located within 38 bp of each other. Comparison with hPMS2 indicated that
JTV1 was transcribed from the opposite strand.

Table 2. Primers used for JTV1 cDNA amplification.

PRIMER NAME  STRAND     PRIMER SEQUENCE                 POSITION*
L            sense      5'-gttctgccatgccgatg-3'         -8 - +9
M            sense      5'-ggcctttggcacgcgctac-3'       -23 - -41
N            sense      5'-accggactgcgttttcccg-3'       -111 - -129
O            sense      5'-tctcagctcgctccatgg-3'        -343 - -360
P            antisense  5'-gcagagacaggttagactc-3'       +139 - +157
Q            sense      5'-gctccttaagtgaattgccg-3'      +952 - +971
R            antisense  5'-tgacacttgacaactggcc-3'       +1068 - +1086

* Relative to the presumptive initiating methionine in Figure 2.



Example 3

The length of one clone representative of JTV1 (pM23NNFL) was 1233 bp
and encoded an open reading frame (ORF) of 936 bp (Figure 2). The first
methionine within this ORF was designated codon 1 (Figure 2) and was preceded
by an in-frame termination codon 66 bp upstream. This methionine had a
reasonable match to the Kozak translation initiation consensus (Kozak, 1986). The
3' end contained a polyadenylation signal (AAUAAA) starting at nucleotide 1086
followed by a polyA tail. The transcript was predicted to encode a polypeptide of
312 amino acids, with a molecular weight of 34.5 kDa. Searches of nucleotide and
peptide sequence databases showed that this was a novel gene, with limited
homology to the glutathione S-transferase gene family.
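
The figures quoted above are mutually consistent, as a short calculation shows. The sketch below takes the coding-region coordinates (114..1049) from the sequence listing and, as an explicit assumption not found in the source, uses the common rule of thumb of about 110 Da per amino acid residue to approximate the molecular weight. The function name is hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110  # rule-of-thumb average residue mass (assumption)

def orf_summary(cds_start, cds_end, avg_residue_mass=AVG_RESIDUE_MASS_DA):
    """Return (ORF length in nt, number of residues, approximate mass in kDa)."""
    length_nt = cds_end - cds_start + 1
    if length_nt % 3 != 0:
        raise ValueError("coding region length is not a multiple of three")
    residues = length_nt // 3
    approx_kda = residues * avg_residue_mass / 1000
    return length_nt, residues, approx_kda

length_nt, residues, approx_kda = orf_summary(114, 1049)
print(length_nt, residues, approx_kda)
```

With these inputs the ORF is 936 nt and 312 codons, and the estimate of roughly 34.3 kDa is close to the 34.5 kDa stated in the text. A sequence-based calculation summing the actual residue masses would be needed for a precise value.
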



WO 97/08312 



PCT/US96/13598




Example 4 

Chromosomal Mapping of JTV1. 

The hPMS2 locus was previously mapped to chromosome 7p22 by FISH
using P1 clone 53 (Nicolaides et al., 1994). Because multiple hPMS2-related
genes are located on the long arm of chromosome 7 and have conserved 5' regions
(personal observation; Hon et al., 1994), we confirmed the genomic localization
of JTV1 by PCR analysis of rodent-human somatic cell hybrid DNAs containing
various regions of chromosome 7 (Scherer et al., 1993; Powers et al., 1993).
PCR primers were chosen from the 3' untranslated region of hPMS2 and JTV1 and
shown to amplify genomic DNA. hPMS2 primers J and K yielded a 121 bp
product and JTV1 primers Q and R yielded a 134 bp product. PCR products for
both genes were formed in those DNAs containing the 7p22 region: lines
GM10791 (containing the entire human chromosome 7), NA11440 (Coriell
Institute) (7p22>7pter) and Ru-Rag4-13 (7cen-7pter) (Figure 3, lanes 1, 2, and 3).
No products were observed in lines 4AF1/106/K015 (7cen-qter), GM05184.17
(7q21.2-qter), or 2068Rag22-2 (7q22-qter) (Figure 3, lanes 4, 5, and 6).
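
The logic of the hybrid-panel mapping can be sketched with set operations: a gene must lie in every region carried by a PCR-positive hybrid and in no region carried by a PCR-negative hybrid. The partition of chromosome 7 into five coarse segments below is a simplifying assumption made for illustration, as are the names `SEGMENTS`, `PANEL`, and `localize`. The hybrid contents and PCR results follow the text.

```python
# Coarse, ordered partition of chromosome 7 (illustrative assumption).
SEGMENTS = ["7pter-p22", "7p21-cen", "7cen-q21.1", "7q21.2-q21.3", "7q22-qter"]

# Hybrid line -> (segments of human chromosome 7 it carries, PCR product seen?)
PANEL = {
    "GM10791": (set(SEGMENTS), True),
    "NA11440": ({"7pter-p22"}, True),
    "Ru-Rag4-13": ({"7pter-p22", "7p21-cen"}, True),
    "4AF1/106/K015": ({"7cen-q21.1", "7q21.2-q21.3", "7q22-qter"}, False),
    "GM05184.17": ({"7q21.2-q21.3", "7q22-qter"}, False),
    "2068Rag22-2": ({"7q22-qter"}, False),
}

def localize(panel, segments):
    """Return the segments consistent with every hybrid's PCR result."""
    candidates = set(segments)
    for content, positive in panel.values():
        if positive:
            candidates &= content  # gene must be inside a positive hybrid
        else:
            candidates -= content  # gene cannot be inside a negative hybrid
    return [s for s in segments if s in candidates]

print(localize(PANEL, SEGMENTS))  # ['7pter-p22']
```

Under this simplified partition the only segment compatible with all six hybrids is the distal short arm, in agreement with the 7p22 assignment reported for both genes.
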

Example 5

Analysis of the 5' Termini of hPMS2 and JTV1.

The 5' termini of hPMS2 transcripts were studied by standard cDNA
cloning, RACE, and RT-PCR analyses. RNA was purified from tissues and cells
using a guanidine isothiocyanate based method (Chomczynski and Sacchi, 1987).
Reverse transcriptase-polymerase chain reaction (RT-PCR) was performed using
randomly primed cDNA as template as described (Leach et al., 1993). RT-PCR
of the 5' end of hPMS2 was performed using a common antisense primer (I) and
the sense primers (A-F) described in Table 1. RT-PCR mapping of the 5' end of
JTV1 was done using a common antisense primer P and the sense primers L-O as
described in Table 2. RACE (rapid amplification of cDNA ends, Frohman et al.,
1988) was performed on hPMS2 using sequential antisense primers I and G (Table
1) following the manufacturer's protocol (Clontech). RACE analysis of JTV1 was
done using the antisense primer P (Table 2). Amplification products were cloned
into a T-tailed vector (Invitrogen) and sequenced using SP6 and T7 primers.
Amplifications were done at 95°C for 30 sec, 56°C for 1.5 min, and 70°C for
1.5 min for 35 cycles. Reaction products were separated by electrophoresis in 6%
nondenaturing polyacrylamide gels.

Figure 4 shows the sequence of the genomic region containing the
transcriptional initiation sites of both hPMS2 and JTV1, numbered as in Figure 1
with respect to hPMS2. The 5' ends of hPMS2 cDNA clones are marked with
arrowheads on the top strand. One clone began at nt -14, one at nt -24, and two
at nt -53. RACE products were generated from adult brain, leukocyte, and
placenta mRNA. Using an antisense primer corresponding to nt +116 to +136,
multiple bands with approximately 160 to 191 bp were observed in addition to
less intense bands of up to 550 bp. The sequence of four cloned RACE products
demonstrated that, as expected, their 5' ends were located between nt -25 and -55.
These data suggested that the majority of hPMS2 transcripts initiated between
nt -13 and -55, with a minority extending further upstream. This was confirmed by
RT-PCR analysis using mRNA from HeLa cells as template. Robust RT-PCR
products were amplified with sense primers whose 5' ends were at nt -14, -110,
-283, and -414 (primers A, B, C, and D; Table 1) and an antisense primer
corresponding to nt +90 to +107 (G). No PCR products were observed using
sense primers whose 5' ends were at nt -448 or -487 (primers E and F). To
ensure that primers E and F were not defective, successful amplification of
genomic DNA was performed using these primers and an antisense primer (O)
corresponding to nt -2 to +16.
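
The primer-walking inference above can be expressed compactly: the most upstream sense primer that still yields a product gives a minimum extent for the longest transcripts, and the nearest failing primer further upstream gives a limit they do not reach. The sketch below is illustrative only. The function name is hypothetical, and the inputs are the primer 5' positions and outcomes reported in the text for hPMS2 (and, for comparison, those reported for JTV1).

```python
def bound_transcript_start(results):
    """Bound the most upstream transcript start from RT-PCR outcomes.

    results maps the 5' position of each sense primer to True (product seen)
    or False (no product). Returns (most upstream positive primer position,
    nearest negative primer position further upstream or None).
    """
    positives = [pos for pos, seen in results.items() if seen]
    if not positives:
        raise ValueError("no primer yielded a product")
    most_upstream_positive = min(positives)
    further_negatives = [pos for pos, seen in results.items()
                         if not seen and pos < most_upstream_positive]
    nearest_negative = max(further_negatives) if further_negatives else None
    return most_upstream_positive, nearest_negative

hpms2 = {-14: True, -110: True, -283: True, -414: True, -448: False, -487: False}
jtv1 = {-8: True, -23: True, -111: True, -360: False}
print(bound_transcript_start(hpms2))  # (-414, -448)
print(bound_transcript_start(jtv1))   # (-111, -360)
```

The result only brackets the longest detectable transcripts. It says nothing about their abundance, which is why the cDNA and RACE data are needed to show that most hPMS2 transcripts start much closer to the initiating methionine.
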

The 5' termini of JTV1 showed a heterogeneous pattern like that of hPMS2.
The 5' ends of the 12 cDNA clones are indicated by arrowheads on the bottom
strand in Figure 4. They were located 73 to 113 nt upstream of codon 1 of
JTV1, which corresponded to nt -271 to -232 of hPMS2. RACE confirmed the
cDNA results in that the majority of products generated using an antisense primer
P corresponding to JTV1 nt +157 were 230 to 270 bp. RT-PCR analysis was
performed with antisense primer P and several sense primers (L-O) listed in Table
2. PCR products were found with sense primers whose 5' ends were at -8, -23,
and -111 (primers L, M, and N) but not with a sense primer O whose 5' end was
at nt -360 with respect to JTV1 nt +1. The latter primer was not defective, as
a genomic segment could be successfully amplified with it.
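
The agreement between the cDNA 5' ends and the RACE product sizes can be checked with simple coordinate arithmetic. The sketch below assumes, as the numbering in the tables suggests but the text does not state outright, that positions run ..., -2, -1, +1, +2, ... with no position 0, and it ignores any adaptor sequence that a RACE protocol may add. The function name is hypothetical.

```python
def span_length(upstream, downstream):
    """Inclusive length in nt between two positions in a numbering without 0."""
    if upstream == 0 or downstream == 0:
        raise ValueError("position 0 does not exist in this numbering")
    if upstream > downstream:
        raise ValueError("upstream position must not exceed downstream position")
    length = downstream - upstream + 1
    if upstream < 0 < downstream:
        length -= 1  # correct for the missing position 0
    return length

# JTV1 transcripts starting 73 to 113 nt upstream, primer P ending at nt +157.
print(span_length(-73, 157))   # 230
print(span_length(-113, 157))  # 270
```

These values reproduce the 230 to 270 bp range reported for the major RACE products. The same convention gives 18 nt for primer A of Table 1 (nt -14 to +4), which matches the length of its listed sequence.
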

Transcripts of hPMS2 had heterogeneous but collinear 5' termini,
containing 11 to 415 nt of presumably untranslated sequence. The transcripts 
contained an in-frame stop codon upstream of the presumptive initiating 
methionines (Figure 1), making the originally described methionine the most likely 
translation initiator. Because no other upstream coding regions of hPMS2 
appeared to exist, the size discrepancy between that predicted from the hPMS2 
sequence and the 110 kDa hPMS2 protein identified by Li and Modrich is likely 
due to post-transcriptional modifications or alternative internal exons. 

Our results revealed that hPMS2 overlaps with a novel gene, JTV1,
transcribed from the opposite strand (Figure 4). This organization is similar to
that of HUMDUG, a mutS-homolog found on human chromosome 5, and the
dihydrofolate reductase (DHFR) gene (Fujii and Shimada, 1989). Both hPMS2-JTV1
and HUMDUG-DHFR lie in a head to head arrangement, both genes are
ubiquitously expressed, and both have multiple 5' termini. It has been
hypothesized that DHFR and HUMDUG may be regulated via a bidirectional
promoter, because a minor subset of the transcripts from the two genes overlap.
The major transcripts of HUMDUG and DHFR, however, do not overlap, as is
true for hPMS2 and JTV1. It will be of interest to determine whether other
mismatch repair genes are arranged in a head to head fashion with a contiguous
gene and if JTV1 is involved in DNA replication or repair.

Example 6 

Expression of hPMS2 and JTV1.

The expression of hPMS2 and JTV1 was analyzed in a variety of mRNA
samples prepared from human tissues. RT-PCR was performed on cDNA
templates derived from adult brain, leukocytes, kidney, large intestine, colon,
salivary gland, lung, testes and prostate using primers J and K for hPMS2 and
primers Q and R for JTV1 (Tables 1 and 2). Both genes were expressed in all
tissues tested (Figure 5).

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: Vogelstein, Bert 

Kinzler, Kenneth W.
Nicolaides, Nicholas C.

(ii) TITLE OF INVENTION: Human JTV1 Gene Overlaps PMS2 Gene

(iii) NUMBER OF SEQUENCES: 5

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Banner & Allegretti, LTD. 

(B) STREET: 1001 G Street, NW 

(C) CITY: Washington DC 

(E) COUNTRY: U.S.A. 

(F) ZIP: 20001 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Kagan, Sarah A.

(B) REGISTRATION NUMBER: 32,141

(C) REFERENCE/DOCKET NUMBER: 1107.49697

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 202-508-9100 

(B) TELEFAX: 202-508-9299 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 384 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 46..384

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

TTACCTGGTA CATCGGCATG GCAGAACCAA AGCAAAAGGG GGTAG CGC GTG CCA 54 

Arg Val Pro 
1 

AAG GCC AAC GCT GAG AAA CCG TCA GAG GTC ACG ACG GAG ACC GGC CAC 102
Lys Ala Asn Ala Gln Lys Pro Ser Glu Val Thr Thr Glu Thr Gly His
5 10 15 

CTC CCT TCT GAC CCT GCT CCG GGC GTT CGG GAA AAC GCA GTC CGC TGT 150 
Leu Pro Ser Asp Pro Ala Ala Gly Val Arg Glu Asn Ala Val Arg Cys 
20 25 30 35 

GCT CTG ATT GGC CCA GGC TCT TTG ACG TCA CCA AGT CGA CCT TTG ACA 198 
Ala Leu Ile Gly Pro Gly Ser Leu Thr Ser Arg Ser Arg Pro Leu Thr
40 45 50 

GAG CCA ATA GGC GAA AAG GAG AGA CGG GAA GTA TTT TTG CCG CCC CGC 246 
Glu Pro Ile Gly Glu Lys Glu Arg Arg Glu Val Phe Leu Pro Pro Arg
55 60 65 

CCG GAA AGG GTG GAG CAC AAC GTC GAA AGC AGC CAA TGG GAG TTC AGG 294
Pro Glu Arg Val Glu His Asn Val Glu Ser Ser Gln Trp Glu Phe Arg
70 75 80 

AGG CGG AGC GCC TGT GGG AGC CCT GGA GGG AAC TTT CCC AGT CCC CGA 342 
Arg Arg Ser Ala Cys Gly Ser Pro Gly Gly Asn Phe Pro Ser Pro Arg 
85 90 95 

GGC GGA TCG GGT GTT GCA TCC ATG GAG CGA GCT GAG AGC TCG 384
Gly Gly Ser Gly Val Ala Ser Met Glu Arg Ala Glu Ser Ser
100 105 110 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 113 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Arg Val Pro Lys Ala Asn Ala Gln Lys Pro Ser Glu Val Thr Thr Glu
1 5 10 15

Thr Gly His Leu Pro Ser Asp Pro Ala Ala Gly Val Arg Glu Asn Ala
20 25 30

Val Arg Cys Ala Leu Ile Gly Pro Gly Ser Leu Thr Ser Arg Ser Arg
35 40 45

Pro Leu Thr Glu Pro Ile Gly Glu Lys Glu Arg Arg Glu Val Phe Leu
50 55 60

Pro Pro Arg Pro Glu Arg Val Glu His Asn Val Glu Ser Ser Gln Trp
65 70 75 80

Glu Phe Arg Arg Arg Ser Ala Cys Gly Ser Pro Gly Gly Asn Phe Pro
85 90 95

Ser Pro Arg Gly Gly Ser Gly Val Ala Ser Met Glu Arg Ala Glu Ser
100 105 110 

Ser 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 1233 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 114..1049

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CCGAACGCCC GCAGCAGGGT CAGAAGGGAG GTGGCCGGTC TCCGTCGTGA CCTCTGACGG 60 

TTTCTGAGCC TTGGCCTTTG GCACCCCCTA CACCCTTTTG CTTTGGTTCT CCC ATG 116

Met 
1 

CCG ATG TAC CAG GTA AAG CCC TAT CAC CGG GGC CCC CCG CCT CTC CGT 164
Pro Met Tyr Gln Val Lys Pro Tyr His Gly Gly Gly Ala Pro Leu Arg
5 10 15

GTG GAG CTT CCC ACC TGC ATG TAC CGG CTC CCC AAC GTG CAC GGC AGG 212
Val Glu Leu Pro Thr Cys Met Tyr Arg Leu Pro Asn Val His Gly Arg
20 25 30

AGC TAC GGC CCA GCG CCG GGC GCT GGC CAC GTG CAG GAA GAG TCT AAC 260
Ser Tyr Gly Pro Ala Pro Gly Ala Gly His Val Gln Glu Glu Ser Asn
35 40 45

CTG TCT CTG CAA GCT CTT GAG TCC CGC CAA GAT GAT ATT TTA AAA CGT 308
Leu Ser Leu Gln Ala Leu Glu Ser Arg Gln Asp Asp Ile Leu Lys Arg
50 55 60 65

CTG TAT GAG TTG AAA GCT GCA GTT GAT GGC CTC TCC AAG ATG ATT CAA 356
Leu Tyr Glu Leu Lys Ala Ala Val Asp Gly Leu Ser Lys Met Ile Gln
70 75 80

ACA CCA GAT GCA GAC TTG GAT GTA ACC AAC ATA ATC CAA GCG GAT GAG 404
Thr Pro Asp Ala Asp Leu Asp Val Thr Asn Ile Ile Gln Ala Asp Glu
85 90 95

CCC ACG ACT TTA ACC ACC AAT GCG CTG GAC TTG AAT TCA GTG CTT GGG 452
Pro Thr Thr Leu Thr Thr Asn Ala Leu Asp Leu Asn Ser Val Leu Gly
100 105 110

AAG GAT TAC GCG GCG CTG AAA GAC ATC GTG ATC AAC GCA AAC CCG GCC 500
Lys Asp Tyr Gly Ala Leu Lys Asp Ile Val Ile Asn Ala Asn Pro Ala
115 120 125

TCC CCT CCC CTC TCC CTG CTT GTG CTG CAC AGG CTG CTC TGT GAG CAC 548
Ser Pro Pro Leu Ser Leu Leu Val Leu His Arg Leu Leu Cys Glu His
130 135 140 145

TTC AGG CTC CTG TCC ACG GTG CAC ACG CAC TCC TCG GTC AAG AGC GTG 596
Phe Arg Val Leu Ser Thr Val His Thr His Ser Ser Val Lys Ser Val
150 155 160

CCT GAA AAC CTT CTC AAG TCC TTT GGA GAA CAG AAT AAA AAA CAG CCC 644
Pro Glu Asn Leu Leu Lys Cys Phe Gly Glu Gln Asn Lys Lys Gln Pro
165 170 175

CGC CAA GAC TAT CAG CTG GGA TTC ACT TTA ATT TGG AAG AAT GTG CCG 692
Arg Gln Asp Tyr Gln Leu Gly Phe Thr Leu Ile Trp Lys Asn Val Pro
180 185 190

AAG ACG CAG ATC AAA TTC AGC ATC CAG ACG ATG TGC CCC ATC GAA GGC 740
Lys Thr Gln Met Lys Phe Ser Ile Gln Thr Met Cys Pro Ile Glu Gly
195 200 205

GAA GGG AAC ATT GCA CGT TTC TTG TTC TCT CTG TTT GGC CAG AAG CAT 788
Glu Gly Asn Ile Ala Arg Phe Leu Phe Ser Leu Phe Gly Gln Lys His
210 215 220 225

AAT GCT GTC AAC GCA ACC CTT ATA GAT AGC TGG GTA CAT ATT GCG ATT 836
Asn Ala Val Asn Ala Thr Leu Ile Asp Ser Trp Val Asp Ile Ala Ile
230 235 240

TTT CAG TTA AAA GAG GGA AGC ACT AAA GAA AAA GCC GCT CTT TTC CGC 884
Phe Gln Leu Lys Glu Gly Ser Ser Lys Glu Lys Ala Ala Val Phe Arg
245 250 255

TCC ATG AAC TCT CCT CTT GGG AAG AGC CCT TGG CTC GCT GGG AAT GAA 932
Ser Met Asn Ser Ala Leu Gly Lys Ser Pro Trp Leu Ala Gly Asn Glu
260 265 270

CTC ACC GTA CCA GAC GTG GTG CTG TGG TCT GTA CTC CAG CAG ATC GCA 980
Leu Thr Val Ala Asp Val Val Leu Trp Ser Val Leu Gln Gln Ile Gly
275 280 285

GGC TGC AGT GTG ACA GTG CCA GCC AAT GTG CAG AGG TGG ATG AGG TCT 1028
Gly Cys Ser Val Thr Val Pro Ala Asn Val Gln Arg Trp Met Arg Ser
290 295 300 305

TGT GAA AAC CTG GCT CCT TTT TAACACGGCC CTCAAGCTCC TTAAGTGAAT 1079 
Cys Glu Asn Leu Ala Pro Phe 
310 

TCCCGTAACT GATTTTAAAG GGTTTAGATT TTAAGAATGG TGCTCTTTCf TGCCTATT^T 1139 

CAGTAAGGGG ACTTGTATTA GAGTCAGAGT CTTTTTATTT AGGCCAGTTG TCAAGTGTCA 1199

ATAAAAGCGC ATCATGTAAT TTAAAAAAAA AAAA 1233 

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 312 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Pro Met Tyr Gln Val Lys Pro Tyr His Gly Gly Gly Ala Pro Leu
1 5 10 15

Arg Val Glu Leu Pro Thr Cys Met Tyr Arg Leu Pro Asn Val His Gly
20 25 30

Arg Ser Tyr Gly Pro Ala Pro Gly Ala Gly His Val Gln Glu Glu Ser
35 40 45

Asn Leu Ser Leu Gln Ala Leu Glu Ser Arg Gln Asp Asp Ile Leu Lys
50 55 60

Arg Leu Tyr Glu Leu Lys Ala Ala Val Asp Gly Leu Ser Lys Met Ile
65 70 75 80

Gln Thr Pro Asp Ala Asp Leu Asp Val Thr Asn Ile Ile Gln Ala Asp
85 90 95

Glu Pro Thr Thr Leu Thr Thr Asn Ala Leu Asp Leu Asn Ser Val Leu
100 105 110

Gly Lys Asp Tyr Gly Ala Leu Lys Asp Ile Val Ile Asn Ala Asn Pro
115 120 125

Ala Ser Pro Pro Leu Ser Leu Leu Val Leu His Arg Leu Leu Cys Glu
130 135 140

His Phe Arg Val Leu Ser Thr Val His Thr His Ser Ser Val Lys Ser
145 150 155 160

Val Pro Glu Asn Leu Leu Lys Cys Phe Gly Glu Gln Asn Lys Lys Gln
165 170 175

Pro Arg Gln Asp Tyr Gln Leu Gly Phe Thr Leu Ile Trp Lys Asn Val
180 185 190

Pro Lys Thr Gln Met Lys Phe Ser Ile Gln Thr Met Cys Pro Ile Glu
195 200 205

Gly Glu Gly Asn Ile Ala Arg Phe Leu Phe Ser Leu Phe Gly Gln Lys
210 215 220

His Asn Ala Val Asn Ala Thr Leu Ile Asp Ser Trp Val Asp Ile Ala
225 230 235 240

Ile Phe Gln Leu Lys Glu Gly Ser Ser Lys Glu Lys Ala Ala Val Phe
245 250 255

Arg Ser Met Asn Ser Ala Leu Gly Lys Ser Pro Trp Leu Ala Gly Asn
260 265 270

Glu Leu Thr Val Ala Asp Val Val Leu Trp Ser Val Leu Gln Gln Ile
275 280 285

Gly Gly Cys Ser Val Thr Val Pro Ala Asn Val Gln Arg Trp Met Arg
290 295 300

Ser Cys Glu Asn Leu Ala Pro Phe
305 310

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 900 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: mRNA

(B) LOCATION: complement (1..900)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 



ACACCCGGCC AATTTCTGTA TTTTTAGTAG AGACGAGGTT TTACCATGTT GGCCAGGCTA 60

GTCTCGAACT CCTGACCTCA GGTGATCCGC CCGCCTCGGC CTCCCAAAGT GCTGGGATTA 120

CAGGCGTGAG CCACGGCGCC CGGCCTGGAT AAATCTTTTA AAAGATAAAA GTCTGAGTGA 180

GTCCCTGGCC CGCCGGCACA GATGCCGGGG TGGGGCCGTG AACCGGTTGG GACGCGCTCG 240

CTCCGGCCTG GGGGGACCCG GGCCAGCAGC CGGTCGCCGC GCGTGCGCAC TGGGCGGGGG 300

GCCCOGCGCT CCTACCTGCA CGTGGCCAGG CCCGGCGCTG GGCCGTAGCT CCTGCCGTGC 360

ACGTTGGGGA GCCGGTACAT GCAGGTGGGA AGCTCCACAC GGAGAGGCGC GCCGCCCCCG 420

TGATACGGCT TTACCTGGTA CATCGGCATG GCAGAACCAA AGCAAAAGGG GGTAGCGCGT 480

GCCAAAGCCC AACGCTCAGA AACCGTCAGA GGTCACGACG GAGACCGGCC ACCTCCCTTC 540

TGACCCTGCT GCGGGCGTTC GGGAAAACGC AGTCCGGTGT GCTCTGATTG GCCCAGGCCC 600

TTTGACCTCA CGAAGTCGAC CTTTGACAGA GCCAATAGGC GAAAAGGAGA GACGGGAAGT 660

ATTTTTGCCC CCCCGCCCCC AAAGGGTGGA GCACAACGTC GAAAGCAGCC AATGGGAGTT 720

CACGAGGCGG AGCGCCTGTG GGAGCCCTCG AGGGAACTTT CCCAGTCCCC GAGGCGGATC 780

GGGTGTTGCA TCCATCGAGC CAGCTGAGAG CTCGAGCTGA GCGGGGCTCG CAGTCTTCCG 840

GTGTCCCCTC TCGCCCGCCC TCTTTGAGAC CCACCGCATT CCAACCTCCC TGGAAATGGG 900

CLAIMS

1. A segment of cDNA consisting of the nucleotide sequence shown
in Figure 2.

2. A vector comprising the segment of DNA of claim 1.

3. A host cell which comprises the vector of claim 2.

4. A composition consisting essentially of a protein consisting of the
amino acid sequence shown in Figure 2.

5. A composition of protein JTV1 as shown in Figure 1, wherein said
composition is free of other human proteins.

6. A segment of cDNA which encodes the amino acid sequence of
JTV1 protein shown in Figure 2.

7. A cDNA probe wherein said cDNA consists of between 15 and 1176
contiguous nucleotides of the sequence shown in SEQ ID NO:1.